40 research outputs found

    On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments

    This paper introduces a new method for multi-channel time domain speech separation in reverberant environments. A fully-convolutional neural network structure is used to directly separate speech from multiple microphone recordings, without the need for conventional spatial feature extraction. To reduce the influence of reverberation on spatial feature extraction, a dereverberation pre-processing method is applied to further improve the separation performance. A spatialized version of the wsj0-2mix dataset has been simulated to evaluate the proposed system. Both the source separation and the speech recognition performance of the separated signals have been evaluated objectively. Experiments show that the proposed fully-convolutional network improves the source separation metric and the word error rate (WER) by more than 13% and 50% relative, respectively, over a reference system with conventional features. Applying dereverberation as pre-processing to the proposed system can further reduce the WER by 29% relative using an acoustic model trained on clean and reverberated data. Comment: Presented at IEEE ICASSP 202

    On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training

    In this paper, we explore an improved framework to train a monoaural neural enhancement model for robust speech recognition. The designed training framework extends the existing mixture invariant training criterion to exploit both unpaired clean speech and real noisy data. It is found that the unpaired clean speech is crucial for improving the quality of speech separated from real noisy speech. The proposed method also remixes processed and unprocessed signals to alleviate processing artifacts. Experiments on the single-channel CHiME-3 real test sets show that the proposed method significantly improves speech recognition performance over an enhancement system trained either on mismatched simulated data in a supervised fashion or on matched real data in an unsupervised fashion. The proposed system achieves between 16% and 39% relative WER reduction compared to the unprocessed signal, using end-to-end and hybrid acoustic models without retraining on distorted data. Comment: Accepted to INTERSPEECH 202

    Mutation in the silencing gene SIR4 can delay aging in S. cerevisiae

    Aging in S. cerevisiae is exemplified by the fixed number of cell divisions that mother cells undergo (termed their life span). We have exploited a correlation between life span and stress resistance to identify mutations in four genes that extend life span. One of these, SIR4, encodes a component of the silencing apparatus at HM loci and telomeres. The sir4-42 mutation extends life span by more than 30% and is semidominant. Our findings suggest that sir4-42 extends life span by preventing recruitment of the SIR proteins to HM loci and telomeres, thereby increasing their concentration at other chromosomal regions. Maintaining silencing at these other regions may be critical in preventing aging. Consistent with this view, expression of only the carboxyl terminus of SIR4 interferes with silencing at HM loci and telomeres, which also extends life span. Possible links among silencing, telomere maintenance, and aging in other organisms are discussed.

    A Multiple-stage Simulation-Based Mixed Integer Nonlinear Programming Approach for Supporting Offshore Oil Spill Recovery with Weathering Processes

    As one of the most commonly used technologies in offshore oil spill response, skimming faces challenges in recovering spilled oil in the north region due to cold weather and harsh marine conditions. It is valuable to simulate and optimize the skimming process to improve the efficiency of oil skimming during emergency response, especially in harsh offshore environments. However, no studies have reported on integrating optimization and simulation approaches to support offshore oil spill recovery by skimmers. This study developed a multiple-stage simulation-based mixed integer nonlinear programming (MSINP) approach to provide sound decisions for skimming spilled oil in a fast, dynamic and cost-efficient manner, which is especially helpful in harsh environments. In the case study, regression models were developed to simulate the efficiencies of two drum skimmers based on the referenced performance tests. The models were further integrated with the optimization methods to determine the optimal strategy for achieving the maximum oil recovery under constraints of time and resources. The results indicated a 96% recovery efficiency based on the optimal settings. Furthermore, the approach was also tested with the integration of the oil weathering processes (e.g., evaporation, emulsification, and dispersion). The results indicated that, when evaporation and dispersion are considered, the optimal setting for maximum oil recovery would be 5 sets of SK1 and 15 sets of SK2, yielding an oil recovery efficiency of 91.5%. The proposed approach was able to efficiently incorporate the regression models and optimization into the same framework and to support efficient skimming for offshore oil spills. The MSINP approach can support offshore oil recovery operations under dynamic conditions in a timely and effective manner, and can therefore provide expeditious decision-making support during offshore oil spill response in harsh environments.

    Time-domain Multi-channel Speech Separation for Overlapping Speech Recognition

    Despite the recent progress of automatic speech recognition (ASR) driven by deep learning, conversational speech recognition using distant microphones is still challenging. In natural environments, an utterance recorded by distant microphones is corrupted by noise and reverberation and overlapped by competing speakers, all of which degrade speech recognition performance. Speech separation techniques aim to recover individual sources from a noisy mixture, and have been shown to be beneficial to robust ASR. Deep-learning based separation approaches using a single microphone have moved towards directly processing time-domain signals and have outperformed time-frequency domain approaches. When multiple microphones are available, spatial information has been demonstrated to be beneficial for separation. This thesis investigates deep-learning based approaches for time-domain separation using multiple microphones. The designed system is further applied to overlapped speech recognition in noisy environments. Three major contributions are summarised as follows. Firstly, a fully-convolutional multi-channel time-domain separation network is developed. The system uses a neural network to automatically learn spatial features from multiple recordings. Different network architectures and multi-stage separation are also considered for the system design. Experiments show that the proposed system achieves better separation and recognition performance than a conventional time-frequency domain approach. Next, the time-domain separation system is extended to a speaker extraction system, which employs speaker identity information. A two-stage speaker conditioning mechanism is proposed to efficiently convey the speaker information to the extraction system. The proposed extraction system can simultaneously output multiple corresponding sources from a noisy mixture and further improve the recognition performance over the blind separation approach. The third contribution studies unsupervised and semi-supervised learning approaches to establish a separation system in situations where only a limited amount of clean data is accessible. An existing unsupervised training strategy that trains a separation system to predict mixtures is improved in this work by exploiting teacher-student learning approaches.

    Understanding changes in Sino-U.S. relations from a historical perspective


    Transcriptomic variation of the flower-fruit transition in Physalis and Solanum

    Main conclusion: Gene expression variations in response to fertilization between Physalis and Solanum might play essential roles in species divergence and fruit evolution. Fertilization triggers variation in fruit development and morphology. The Chinese lantern, a morphological novelty derived from the calyx, is formed upon fertilization in Physalis but is not observed in Solanum. The underlying genetic variations are largely unknown. Here, we documented the developmental and morphological differences in the flower and fruit between Physalis floridana and Solanum pimpinellifolium and then evaluated both the transcript sequence variation and gene expression at the transcriptomic level at fertilization between the two species. In Physalis transcriptomic analysis, 468 unigenes were identified as differentially expressed genes (DEGs) that were strongly regulated by fertilization across 3 years. In comparison with tomato, 14,536 strict single-copy orthologous gene pairs were identified between P. floridana and S. pimpinellifolium in the flower-fruit transcriptome. Nine types of gene variations with specific GO-enriched patterns were identified, covering 58.82% orthologous gene pairs that were DEGs in either trend or dosage at the flower-fruit transition between the two species, which could adequately distinguish Solanum and Physalis, implying that differential gene expression at fertilization might play essential roles during the divergence and fruit evolution of Solanum-Physalis. Virus-induced gene silencing analyses revealed the developmental roles of some transcription factor genes in fertility, Chinese lantern development, and fruit weight control in Physalis. This study presents the first floral transcriptomic resource of Physalis, and reveals some candidate genetic variations accounting for the early fruit developmental evolution in Physalis in comparison to Solanum.